Goto

Collaborating Authors

 graph coloring


Graph Coloring via Neural Networks for Haplotype Assembly and Viral Quasispecies Reconstruction

Neural Information Processing Systems

Understanding genetic variation, e.g., through mutations, in organisms is crucial to unravel their effects on the environment and human health. A fundamental characterization can be obtained by solving the haplotype assembly problem, which yields the variation across multiple copies of chromosomes. Variations among fast evolving viruses that lead to different strains (called quasispecies) are also deciphered with similar approaches. In both these cases, high-throughput sequencing technologies that provide oversampled mixtures of large noisy fragments (reads) of genomes, are used to infer constituent components (haplotypes or quasispecies). The problem is harder for polyploid species where there are more than two copies of chromosomes. State-of-the-art neural approaches to solve this NP-hard problem do not adequately model relations among the reads that are important for deconvolving the input signal. We address this problem by developing a new method, called NeurHap, that combines graph representation learning with combinatorial optimization. Our experiments demonstrate the substantially better performance of NeurHap in real and synthetic datasets compared to competing approaches.


Scalable design of Error-Correcting Output Codes using Discrete Optimization with Graph Coloring

Neural Information Processing Systems

We study the problem of scalable design of Error-Correcting Output Codes (ECOC) for multi-class classification. Prior works on ECOC-based classifiers are limited to codebooks with small number of rows (classes) or columns, and do not provide optimality guarantees for the codebook design problem. We address these limitations by developing a codebook design approach based on a Mixed-Integer Quadratically Constrained Program (MIQCP). This discrete formulation is naturally suited for maximizing the error-correction capability of ECOC-based classifiers and incorporates various design criteria in a flexible manner. Our solution approach is tractable in that it incrementally increases the codebook size by adding columns to maximize the gain in error-correcting capability.


Graph Coloring via Neural Networks for Haplotype Assembly and Viral Quasispecies Reconstruction

Neural Information Processing Systems

The pseudocode for the NeurHap-refine is as follows: Algorithm 1: The Local Refinement Algorithm NeurHap-refine. Two categories of datasets are used in the paper, Polyploid species and Viral Quasispecies . BW A-MEM [Li, 2013] is used to align reads to the reference genome. The detailed command is (take the 15-strain ZIKV as an example): $ ./bwa Vikalo, 2020a,b] to derive the SNP matrix from the above alignment to ensure a fair comparison.


EHOP: A Dataset of Everyday NP-Hard Optimization Problems

Duchnowski, Alex, Pavlick, Ellie, Koller, Alexander

arXiv.org Artificial Intelligence

We introduce the dataset of Everyday Hard Optimization Problems (EHOP), a collection of NP-hard optimization problems expressed in natural language. EHOP includes problem formulations that could be found in computer science textbooks, versions that are dressed up as problems that could arise in real life, and variants of well-known problems with inverted rules. We find that state-of-the-art LLMs, across multiple prompting strategies, systematically solve textbook problems more accurately than their real-life and inverted counterparts. We argue that this constitutes evidence that LLMs adapt solutions seen during training, rather than leveraging reasoning abilities that would enable them to generalize to novel problems.


Self-Supervised Transformers as Iterative Solution Improvers for Constraint Satisfaction

Xu, Yudong W., Li, Wenhao, Sanner, Scott, Khalil, Elias B.

arXiv.org Artificial Intelligence

We present a Transformer-based framework for Constraint Satisfaction Problems (CSPs). CSPs find use in many applications and thus accelerating their solution with machine learning is of wide interest. Most existing approaches rely on supervised learning from feasible solutions or reinforcement learning, paradigms that require either feasible solutions to these NP-Complete CSPs or large training budgets and a complex expert-designed reward signal. To address these challenges, we propose ConsFormer, a self-supervised framework that leverages a Transformer as a solution refiner. ConsFormer constructs a solution to a CSP iteratively in a process that mimics local search. Instead of using feasible solutions as labeled data, we devise differentiable approximations to the discrete constraints of a CSP to guide model training. Our model is trained to improve random assignments for a single step but is deployed iteratively at test time, circumventing the bottlenecks of supervised and reinforcement learning. Our method can tackle out-of-distribution CSPs simply through additional iterations.


Graph Coloring via Neural Networks for Haplotype Assembly and Viral Quasispecies Reconstruction

Neural Information Processing Systems

Understanding genetic variation, e.g., through mutations, in organisms is crucial to unravel their effects on the environment and human health. A fundamental characterization can be obtained by solving the haplotype assembly problem, which yields the variation across multiple copies of chromosomes. Variations among fast evolving viruses that lead to different strains (called quasispecies) are also deciphered with similar approaches. In both these cases, high-throughput sequencing technologies that provide oversampled mixtures of large noisy fragments (reads) of genomes, are used to infer constituent components (haplotypes or quasispecies). The problem is harder for polyploid species where there are more than two copies of chromosomes.


Scalable design of Error-Correcting Output Codes using Discrete Optimization with Graph Coloring

Neural Information Processing Systems

We study the problem of scalable design of Error-Correcting Output Codes (ECOC) for multi-class classification. Prior works on ECOC-based classifiers are limited to codebooks with small number of rows (classes) or columns, and do not provide optimality guarantees for the codebook design problem. We address these limitations by developing a codebook design approach based on a Mixed-Integer Quadratically Constrained Program (MIQCP). This discrete formulation is naturally suited for maximizing the error-correction capability of ECOC-based classifiers and incorporates various design criteria in a flexible manner. Our solution approach is tractable in that it incrementally increases the codebook size by adding columns to maximize the gain in error-correcting capability.


Graph Coloring Using Heat Diffusion

Chaudhary, Vivek

arXiv.org Artificial Intelligence

Graph coloring is a problem with varied applications in industry and science such as scheduling, resource allocation, and circuit design. The purpose of this paper is to establish if a new gradient based iterative solver framework known as heat diffusion can solve the graph coloring problem. We propose a solution to the graph coloring problem using the heat diffusion framework. We compare the solutions against popular methods and establish the competitiveness of heat diffusion method for the graph coloring problem.


Reinforcement Learning for Graph Coloring: Understanding the Power and Limits of Non-Label Invariant Representations

Cummins, Chase, Veras, Richard

arXiv.org Artificial Intelligence

Register allocation is one of the most important problems for modern compilers. With a practically unlimited number of user variables and a small number of CPU registers, assigning variables to registers without conflicts is a complex task. This work demonstrates the use of casting the register allocation problem as a graph coloring problem. Using technologies such as PyTorch and OpenAI Gymnasium Environments we will show that a Proximal Policy Optimization model can learn to solve the graph coloring problem. We will also show that the labeling of a graph is critical to the performance of the model by taking the matrix representation of a graph and permuting it. We then test the model's effectiveness on each of these permutations and show that it is not effective when given a relabeling of the same graph. Our main contribution lies in showing the need for label reordering invariant representations of graphs for machine learning models to achieve consistent performance.


Constraint and Satisfiability Reasoning for Graph Coloring

Hebrard, Emmanuel (LAAS-CNRS, Université de Toulouse, CNRS, France) | Katsirelos, George (UMR MIA-Paris, INRA, AgroParisTech, Université Paris-Saclay, France)

Journal of Artificial Intelligence Research

Graph coloring is an important problem in combinatorial optimization and a major component of numerous allocation and scheduling problems. In this paper we introduce a hybrid CP/SAT approach to graph coloring based on the addition-contraction recurrence of Zykov. Decisions correspond to either adding an edge between two non-adjacent vertices or contracting these two vertices, hence enforcing inequality or equality, respectively. This scheme yields a symmetry-free tree and makes learnt clauses stronger by not committing to a particular color. We introduce a new lower bound for this problem based on Mycielskian graphs; a method to produce a clausal explanation of this bound for use in a CDCL algorithm; a branching heuristic emulating Brélaz' heuristic on the Zykov tree; and dedicated pruning techniques relying on marginal costs with respect to the bound and on reasoning about transitivity when unit propagating learnt clauses. The combination of these techniques in both a branch-and-bound and in a bottom-up search outperforms other SAT-based approaches and Dsatur on standard benchmarks both for finding upper bounds and for proving lower bounds.